Abstract


The SMILES Chemical Reaction Database is a set of six files
containing structural information about the reactants and
products of two million different chemical reactions. The
simplified molecular-input line-entry system (SMILES)
represents molecular connectivity and stereochemical
relationships as strings of characters, and it can represent
chemical reactions in the same way. The database is now
186.8 MB in size, and its two million reactant-product pairs
were extracted from thousands of respected journals and
patents. Within each file, the reaction entries occur on
consecutive lines, separated by newline characters. The
SMILES Chemical Reaction Database is useful for many types
of chemoinformatics tasks, including reaction planning, the
application of artificial intelligence techniques to reaction
planning, and medicinal chemistry applications such as gene
therapy. The database can be used with any chemistry software
capable of reading the SMILES format, as most chemistry
software is. The SMILES Chemical Reaction Database is
copyrighted by James Bonnar (TTM). For more information about
the SMILES format and its application to artificial
intelligence, see the publication Machine Learning Methods in
Chemistry.


Introduction 


The SMILES Chemical Reaction Database (SCRD) is a set of six
files containing structural information about the reactants
and products of two million different chemical reactions. The
simplified molecular-input line-entry system (SMILES)
represents molecular connectivity and stereochemical
relationships as strings of characters, and it can represent
chemical reactions in the same way. In 2007, soon after the
supporting image knowledge-extraction and spidering software
had been completed, TTM (owned by James Bonnar) began rapid
work on assembling a human-reviewed chemical reaction
database. The SCRD is now 186.8 MB in size, and its two
million reactant-product pairs were extracted from thousands
of respected journals and patents. Within each file, the
reaction entries occur on consecutive lines, separated by
newline characters.
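
The one-entry-per-line layout described above lends itself to
simple streaming. The following Python is a minimal sketch,
assuming each line holds one reaction SMILES of the conventional
form reactants>agents>products, with "." separating molecules.
The exact layout of the SCRD files is not specified here, so
that layout and the names Reaction, parse_reaction_smiles and
iter_reactions are illustrative assumptions, not part of the
database.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class Reaction(NamedTuple):
    """One parsed entry (hypothetical container, not an SCRD type)."""
    reactants: List[str]
    agents: List[str]
    products: List[str]


def parse_reaction_smiles(line: str) -> Reaction:
    """Split a reaction SMILES string into its three parts.

    Assumes the form reactants>agents>products. Each part is split
    on '.', so disconnected components (for example, the ions of a
    salt) are returned as separate entries.
    """
    parts = line.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"not a reaction SMILES: {line!r}")
    reactants, agents, products = (
        [mol for mol in part.split(".") if mol] for part in parts
    )
    return Reaction(reactants, agents, products)


def iter_reactions(handle: TextIO) -> Iterator[Reaction]:
    """Yield one Reaction per non-blank line of an open text file."""
    for line in handle:
        if line.strip():
            yield parse_reaction_smiles(line)


# Stand-in for one database file: acetic acid + ethanol giving
# ethyl acetate + water, written as a single reaction SMILES line.
sample = io.StringIO("CC(=O)O.OCC>>CC(=O)OCC.O\n")
for reaction in iter_reactions(sample):
    print(reaction.reactants, "->", reaction.products)
```

For a real file, an open text handle would be passed in place of
sample. Reading line by line keeps memory use small even for
files of many megabytes.
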


Use of the SCRD in Chemical Reaction
Planning


The SMILES Chemical Reaction Database is ideal for
chemical reaction planning. In general, a practitioner
runs a substructure search on the database and
compares the results with the structure and reaction
of interest. Many applications on the market perform
substructure searching. Such searches often reveal
unforeseen, lucrative reaction pathways.


Use of the SCRD in AI Applications in 
Chemistry 


Because the SCRD is in SMILES format, it is ideal 
for the application of artificial intelligence techniques 
in reaction planning. This adds a new dimension to 
reaction planning that supersedes even the database
itself. For more information see Machine Learning 
Methods in Chemistry. 


Use of the SCRD in Medicinal Chemistry 


Because the SCRD consists of reactions that involve 
covalent changes to reactants, its use in medicinal 
chemistry involves a new approach to the treatment 
of disease. In particular, the SCRD can be used 
to design agents that produce covalent changes in 
DNA, essentially pharmacological agents that correct 
errant genes (gene therapy). For more information,
see Machine Learning Methods in Chemistry.


Availability of the SCRD


The SMILES Chemical Reaction Database can be
obtained from Payhip. You will receive a ZIP file
containing the six files of the SCRD; simply unzip
it. The price of the SCRD is less than half a
cent per data entry.